Information-Theoretically Optimal Histogram Density Estimation

Authors

  • Petri Kontkanen
  • Petri Myllymäki
Abstract

We regard histogram density estimation as a model selection problem. Our approach is based on the information-theoretic minimum description length (MDL) principle. MDL-based model selection is formalized via the normalized maximum likelihood (NML) distribution, which has several desirable optimality properties. We show how this approach can be applied to learning generic, irregular (variable-width bin) histograms, and how to compute the model selection criterion efficiently. We also derive a dynamic programming algorithm that finds both the NML-optimal bin count and the cut point locations in polynomial time. Finally, we demonstrate our approach via simulation tests.
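The kind of search the abstract describes can be illustrated with a minimal sketch. It is not the authors' exact algorithm or criterion: it assumes a user-supplied, strictly increasing list of candidate edges, scores a K-bin histogram by its negative log-likelihood plus the log of the multinomial NML normalizer plus the log of the number of ways to pick K-1 interior cut points, and omits any data-precision term. The names `multinomial_complexity`, `bin_loglik` and `nml_histogram` are hypothetical. The normalizer is computed with the known recurrence C(K+2, n) = C(K+1, n) + (n/K) C(K, n).

```python
import math
from bisect import bisect_right


def multinomial_complexity(n, max_k):
    """c[k] = multinomial NML normalizer for k categories and n points."""
    c = [0.0] * (max_k + 1)
    c[1] = 1.0
    if max_k >= 2:
        total = 0.0
        for h in range(n + 1):
            # log of binom(n, h) * (h/n)^h * ((n-h)/n)^(n-h), with 0^0 = 1
            log_term = (math.lgamma(n + 1) - math.lgamma(h + 1)
                        - math.lgamma(n - h + 1))
            if h > 0:
                log_term += h * math.log(h / n)
            if h < n:
                log_term += (n - h) * math.log((n - h) / n)
            total += math.exp(log_term)
        c[2] = total
    for k in range(1, max_k - 1):
        c[k + 2] = c[k + 1] + (n / k) * c[k]
    return c


def bin_loglik(h, width, n):
    """Log-likelihood of h points in a bin with constant density h/(n*width)."""
    return h * math.log(h / (n * width)) if h > 0 else 0.0


def nml_histogram(data, edges, max_bins=10):
    """Return (bin count, interior cut points, score) minimizing the sketch criterion.

    Assumption: edges is strictly increasing and spans all of the data.
    """
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("data must be non-empty")
    E = len(edges) - 1  # number of elementary intervals
    # cum[e] = number of points in [edges[0], edges[e]]
    cum = [0] + [bisect_right(xs, e) for e in edges[1:]]
    if xs[0] < edges[0] or cum[-1] != n:
        raise ValueError("edges must span the data")
    max_k = min(max_bins, E)
    comp = multinomial_complexity(n, max_k)

    neg_inf = float("-inf")
    # best[k][e]: max log-likelihood of covering edges[0..e] with k bins
    best = [[neg_inf] * (E + 1) for _ in range(max_k + 1)]
    back = [[0] * (E + 1) for _ in range(max_k + 1)]
    for e in range(1, E + 1):
        best[1][e] = bin_loglik(cum[e], edges[e] - edges[0], n)
    for k in range(2, max_k + 1):
        for e in range(k, E + 1):
            for prev in range(k - 1, e):
                cand = best[k - 1][prev] + bin_loglik(
                    cum[e] - cum[prev], edges[e] - edges[prev], n)
                if cand > best[k][e]:
                    best[k][e] = cand
                    back[k][e] = prev

    # Penalized score per bin count; the penalty depends only on k here.
    scores = {}
    for k in range(1, max_k + 1):
        penalty = math.log(comp[k]) + math.log(math.comb(E - 1, k - 1))
        scores[k] = -best[k][E] + penalty
    k_opt = min(scores, key=scores.get)

    # Backtrack to recover the interior cut points of the best histogram.
    cuts, e = [], E
    for k in range(k_opt, 1, -1):
        e = back[k][e]
        cuts.append(edges[e])
    cuts.reverse()
    return k_opt, cuts, scores[k_opt]
```

Calling `nml_histogram(xs, edges)` with a list of observations and a grid of candidate edges returns the selected bin count, the chosen interior cut points and the score. The triple loop costs O(E² · K) for E elementary intervals and at most K bins, which is polynomial, in line with the abstract's claim about the dynamic program.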

Similar resources

Optimally Learning Populations of Parameters

Consider the following fundamental estimation problem: there are n entities, each with an unknown parameter p_i ∈ [0, 1], and we observe n independent random variables, X_1, ..., X_n, with X_i ∼ Binomial(t, p_i). How accurately can one recover the “histogram” (i.e. cumulative density function) of the p_i's? While the empirical estimates would recover the histogram to earth mover distance Θ(1/√t) (e...

MDL Histogram Density Estimation

We regard histogram density estimation as a model selection problem. Our approach is based on the information-theoretic minimum description length (MDL) principle, which can be applied to tasks such as data clustering, density estimation, image denoising and model selection in general. MDL-based model selection is formalized via the normalized maximum likelihood (NML) distribution, which has se...

Improving Accuracy and Efficiency of Mutual Information for Multi-modal Retinal Image Registration using Adaptive Probability Density Estimation

Mutual Information (MI) is a popular similarity measure for performing image registration between different modalities. MI makes a statistical comparison between two images by computing the entropy from the probability distribution of the data. Therefore, to obtain an accurate registration it is important to have an accurate estimation of the true underlying probability distribution. Within the ...

Improving accuracy and efficiency of mutual information for multi-modal retinal image registration using adaptive probability density estimation

Mutual information (MI) is a popular similarity measure for performing image registration between different modalities. MI makes a statistical comparison between two images by computing the entropy from the probability distribution of the data. Therefore, to obtain an accurate registration it is important to have an accurate estimation of the true underlying probability distribution. Within the...

Learning Populations of Parameters

Consider the following estimation problem: there are n entities, each with an unknown parameter p_i ∈ [0, 1], and we observe n independent random variables, X_1, ..., X_n, with X_i ∼ Binomial(t, p_i). How accurately can one recover the “histogram” (i.e. cumulative density function) of the p_i's? While the empirical estimates would recover the histogram to earth mover distance Θ(1/√t) (equivalen...

Publication date: 2006